LDA-based Topic Modelling in Text Sentiment Classification: An Empirical Analysis
Abstract
Sentiment analysis is the process of identifying the subjective information that source materials express towards an entity. It is a subfield of text and web mining, and the Web is a rich and steadily expanding source of information. Sentiment analysis can be modelled as a text classification problem, but text classification suffers from a high-dimensional feature space and feature sparsity. Representing text documents with conventional schemes can be extremely costly, especially for large text collections, so data reduction techniques are viable tools for representing document collections. Latent Dirichlet allocation (LDA) is a popular generative probabilistic model for representing collections of discrete data. This paper examines the performance of LDA in text sentiment classification. In the empirical analysis, five classification algorithms (Naïve Bayes, support vector machines, logistic regression, radial basis function network and K-nearest neighbor) and five ensemble methods (Bagging, AdaBoost, Random Subspace, voting and stacking) are evaluated on four sentiment datasets.
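To make the data reduction idea concrete, the following is a minimal sketch of how LDA can turn each document into a short vector of topic proportions that a classifier could then use as features. It is not the paper's implementation: it assumes a plain collapsed Gibbs sampler, and the toy reviews, the function names (build_vocab, lda_topic_features) and the hyperparameter values (alpha, beta, iters, seed) are hypothetical choices made for illustration only.

```python
import random


def build_vocab(texts):
    """Split texts on whitespace and map each distinct lowercase token to an id."""
    vocab = {}
    docs = []
    for text in texts:
        ids = []
        for token in text.lower().split():
            ids.append(vocab.setdefault(token, len(vocab)))
        docs.append(ids)
    return docs, vocab


def lda_topic_features(docs, num_topics, vocab_size,
                       alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, assumed settings).

    Returns one list of topic proportions per document; each list has
    num_topics entries and sums to 1.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # word w assigned to topic k
    topic_total = [0] * num_topics                              # all tokens in topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    topics = range(num_topics)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in topics
                ]
                k = rng.choices(topics, weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions: the reduced representation.
    features = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        features.append([(doc_topic[d][t] + alpha) / denom for t in topics])
    return features


# Hypothetical toy reviews, not taken from the paper's datasets.
reviews = [
    "great phone great battery",
    "terrible battery terrible screen",
    "great screen love this phone",
    "terrible service never again",
]
docs, vocab = build_vocab(reviews)
theta = lda_topic_features(docs, num_topics=2, vocab_size=len(vocab))
for text, row in zip(reviews, theta):
    print([round(p, 3) for p in row], text)
```

Each review is reduced from a sparse vector over the whole vocabulary to num_topics numbers. Those vectors could be passed to any of the classifiers or ensemble methods named in the abstract; which settings the paper actually used is not stated here.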
Similar resources
Probabilistic topic models for sentiment analysis on the Web
Sentiment analysis aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text, and has received a rapid growth of interest in natural language processing in recent years. Probabilistic topic models, on the other hand, are capable of discovering hidden thematic structure in large archives of documents, and have been an active research...
A High-Performance Model based on Ensembles for Twitter Sentiment Classification
Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. It lets users publish tweets to tell what they are thinking about topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
Efficient Method Based on Combination of Deep Learning Models for Sentiment Analysis of Text
People's opinions about a specific concept are considered as one of the most important textual data that are available on the web. However, finding and monitoring web pages containing these comments and extracting valuable information from them is very difficult. In this regard, developing automatic sentiment analysis systems that can extract opinions and express their intellectual process has ...
Sentiment Classification: A Topic Sequence-Based Approach
With the development of Web 2.0, sentiment analysis has been widely used in many domains, such as information retrieval (IR), artificial intelligence and social networks. This paper focuses on the task of classifying a textual review as expressing a positive or negative sentiment, a core task of sentiment analysis called sentiment classification. To address this problem, we present a novel sent...
Journal title:
- Int. J. Comput. Linguistics Appl.
Volume 7, Issue
Pages -
Publication date 2016